A gloss-centered algorithm for disambiguation
نویسندگان
چکیده
The task of word sense disambiguation is to assign a sense label to a word in a passage. We report our algorithms and experiments for the two tasks that we participated in viz. the task of WSD of WordNet glosses and the task of WSD of English lexical sample. For both the tasks, we explore a method of sense disambiguation through a process of “comparing” the current context for a word against a repository of contextual clues or glosses for each sense of each word. We compile these glosses in two different ways for the two tasks. For the first task, these glosses are all compiled using WordNet and are of various types viz. hypernymy glosses, holonymy mixture, descriptive glosses and some hybrid mixtures of these glosses. The “comparison” could be done in a variety of ways that could include/exclude stemming, expansion of one gloss type with another gloss type, etc. The results show that the system does best when stemming is used and glosses are expanded. However, it appears that the evidence for word-senses ,accumulated through WordNet, in the form of glosses, are quite sparse. Generating dense glosses for all WordNet senses requires a massive sense tagged corpus which is currently unavailable. Hence, as part of the English lexical sample task, we try the same approach on densely populated glosses accumulated from the training data for this task.
منابع مشابه
Structural semantic interconnection: a knowledge-based approach to Word Sense Disambiguation
In this paper we describe the SSI algorithm, a structural pattern matching algorithm for WSD. The algorithm has been applied to the gloss disambiguation task of Senseval-3.
متن کاملMRD-based Word Sense Disambiguation: Further#2 Extending#1 Lesk
This paper reconsiders the task of MRDbased word sense disambiguation, in extending the basic Lesk algorithm to investigate the impact onWSD performance of different tokenisation schemes, scoring mechanisms, methods of gloss extension and filtering methods. In experimentation over the Lexeed Sensebank and the Japanese Senseval2 dictionary task, we demonstrate that character bigrams with sense-s...
متن کاملMRD-based Word Sense Disambiguation: Further Extending Lesk
This paper reconsiders the task of MRDbased word sense disambiguation, in extending the basic Lesk algorithm to investigate the impact onWSD performance of different tokenisation schemes, scoring mechanisms, methods of gloss extension and filtering methods. In experimentation over the Lexeed Sensebank and the Japanese Senseval2 dictionary task, we demonstrate that character bigrams with sense-s...
متن کاملConcept sense disambiguation in concept maps using WordNet
In this report an unsupervised and knowledge-based algorithm for concept sense disambiguation in concept maps is proposed. Concept maps are graphical tools for organizing and representing knowledge, based on concepts and labeled interconnections among them, forming propositions. The disambiguation process is carried combining Magnini’s domain, context information and the gloss. It’s supported i...
متن کاملBaldwin, Timothy, Su Nam Kim, Francis Bond, Sanae Fujita, David Martinez and Takaaki Tanaka (2008) MRD-based Word Sense Disambiguation: Further Extending Lesk, In Proceedings of the Third International Joint Conference on Natural Language Processing (IJCNLP 2008), Hyderabad, India
This paper reconsiders the task of MRDbased word sense disambiguation, in extending the basic Lesk algorithm to investigate the impact onWSD performance of different tokenisation schemes, scoring mechanisms, methods of gloss extension and filtering methods. In experimentation over the Lexeed Sensebank and the Japanese Senseval2 dictionary task, we demonstrate that character bigrams with sense-s...
متن کامل